Data Science by ODS.ai 🦜 | Telegram Webview: opendatascience/2330 -
⚙️ SWE-rebench: Nebius AI R&D team presents a new dataset for SWE tasks.

Researchers built an automated system to collect and validate thousands of real-world tasks from GitHub, designed for training and evaluation of LLMs in software engineering.

Main features of the system:
1️⃣ Automatic data collection: Continuously extracts issue-PR pairs from Python repositories.
2️⃣ LLM-based environment setup: An LLM analyzes each repository, writes install instructions, and revises them when errors occur.
3️⃣ Execution-based validation: Each task is validated through automated setup, a test run, and dependency freezing, which makes it reproducible.
4️⃣ LLM quality annotation: Tasks are labeled for clarity, difficulty, and test correctness to support filtering.
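
To make steps 3 and 4 concrete, here is a minimal Python sketch of how validated tasks might be filtered. It is not the authors' implementation. It assumes a SWE-bench-style rule: a task is valid if at least one test fails before the PR patch and passes after it, and no previously passing test breaks. The `Task` class, its fields, and the 1-3 `difficulty` scale are hypothetical names made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Task:
    # All field names are hypothetical, chosen for this sketch only.
    task_id: str
    before: dict[str, bool]  # test name -> passed, before the PR patch
    after: dict[str, bool]   # test name -> passed, after the PR patch
    difficulty: int          # assumed LLM-assigned label: 1 (easy) to 3 (hard)


def fail_to_pass(task: Task) -> list[str]:
    """Tests that pass after the patch but did not pass (or exist) before."""
    return sorted(
        name for name, ok in task.after.items()
        if ok and not task.before.get(name, False)
    )


def pass_to_fail(task: Task) -> list[str]:
    """Tests that passed before the patch but fail (or vanish) after it."""
    return sorted(
        name for name, ok in task.before.items()
        if ok and not task.after.get(name, False)
    )


def is_valid(task: Task) -> bool:
    # Assumed criterion: the patch fixes something and breaks nothing.
    return bool(fail_to_pass(task)) and not pass_to_fail(task)


def select(tasks: list[Task], max_difficulty: int) -> list[str]:
    """Keep valid tasks whose annotated difficulty is within the limit."""
    return [
        t.task_id for t in tasks
        if is_valid(t) and t.difficulty <= max_difficulty
    ]


tasks = [
    Task("a", {"t1": True, "t2": False}, {"t1": True, "t2": True}, 1),
    Task("b", {"t1": True}, {"t1": True}, 1),            # nothing fixed
    Task("c", {"t1": True, "t2": False}, {"t1": False, "t2": True}, 2),  # regression
    Task("d", {}, {"t_new": True}, 3),                   # valid, but hard
]
print(select(tasks, max_difficulty=2))
```

In this toy data only task "a" survives a difficulty cap of 2. Raising the cap to 3 also admits "d", whose only test was introduced by the PR itself.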

Result:
SWE-rebench dataset: 21,000+ ready-to-use interactive tasks.
Continuous updates: Fresh data is added regularly.
Transparent evaluation: Tasks are used for the public SWE-rebench leaderboard.

🚀 SWE-rebench gives researchers and developers real, validated tasks for working with LLMs in the SWE field.

Technical report: arXiv
Dataset: SWE-rebench




BY Data Science by ODS.ai 🦜




